# A Reconfigurable Multi-Core Architecture to Support SPMD Applications

#### John K. Antonio

Military and Aerospace Programmable Logic Devices (MAPLD) Conference

Annapolis, Maryland, September 15-18, 2008

#### Overview

- Drivers for multi-core technology path
- Proposed framework for reconfigurable multi-core architectures
- Illustrative analysis
- Conclusions



# Near term drivers for multi-core technology path

- Single-core path leading to increased cost, heat, and power consumption
- Single-core path widens the processor/memory speed gap
- Multi-core path transparent to many application domain developers
- Multi-core path can improve performance of threaded software

# Typical multi-core architecture\*





\*L. Chai, Q. Gao, D.K. Panda, "Understanding the Impact of Multi-Core Architecture in Cluster Computing: A Case Study with Intel Dual-Core System," *Seventh Int'l Symposium on Cluster Computing and the Grid (CCGrid)*, Rio de Janeiro - Brazil, May 2007.

# Future drivers and requirements for multi-core architectures

- Scale to support massively data-parallel (SPMD) applications
- Match coupling among cores with application granularity



### Proposed architectural framework



# Shared everything configuration





*Figure: shared everything configuration (diagram label: reconfigurable logic).*

# Shared nothing configuration





*Figure: shared nothing configuration (diagram label: reconfigurable logic).*

# Hybrid configuration





*Figure: hybrid configuration (diagram label: reconfigurable logic).*

#### Features of proposed architecture

- Match core coupling and core processing capacity with application granularity
  - Fixed multiprocessor architecture not well matched with all application granularities
  - Proposed reconfigurable multi-core architecture can be configured to match core coupling with application granularity







### **Illustrative Analysis**

#### Notation

- Number of cores: c
- Problem size: n
- Sequential time complexity:  $T_s(n)$
- Parallel time complexity:

$$T_P(c,n) = K \times f(c,n) + L \times g(c,n)$$

- Computational complexity: f(c, n)
- Communication complexity: g(c, n)
- Core coupling ratio: K / L

#### Example

Sequential Time:  $T_s(n) = n$ 

Parallel Time: $T_P(c,n) = K \times (n/c) + L \times \log c$

Speedup: $S = \frac{n}{K \times (n/c) + L \times \log c}$

The value of *K*: related to core processing capacity



The value of *L*: related to interconnection among cores
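The example model can be evaluated numerically to see how speedup changes with the number of cores. The following is a minimal sketch, not the analysis code behind the slides. The function names, the problem size `n = 1024`, and the choice of a base-2 logarithm are assumptions made for illustration, since the slides do not state the base of $\log c$.

```python
import math


def parallel_time(c, n, K=1.0, L=1.0):
    """Parallel time T_P(c, n) = K * (n / c) + L * log(c).

    Assumption: the logarithm is taken base 2; the slides leave the base open.
    """
    return K * (n / c) + L * math.log2(c)


def speedup(c, n, K=1.0, L=1.0):
    """Speedup S = T_s(n) / T_P(c, n), with sequential time T_s(n) = n."""
    return n / parallel_time(c, n, K, L)


if __name__ == "__main__":
    n = 1024  # hypothetical problem size, chosen only for illustration
    for c in (1, 2, 4, 16, 64, 256, 1024):
        print(f"c={c:5d}  S={speedup(c, n):8.2f}")
```

With a single core the communication term vanishes, so the model gives $S = 1/K$. As $c$ grows, the $L \times \log c$ term takes up a larger share of the parallel time.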

#### K = 1.0, L = 1.0





Computer Science, University of Oklahoma

#### K = 1.5, L = 0.5






#### K = 0.5, L = 1.5
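The three $(K, L)$ settings considered in this analysis, (1.0, 1.0), (1.5, 0.5), and (0.5, 1.5), can be compared side by side under the example model. The sketch below is illustrative only and does not reproduce the original plots. The helper names, the problem size, the core-count limit, and the base-2 logarithm are all assumptions. For each setting it finds, by brute-force search, the core count that maximizes speedup.

```python
import math


def speedup(c, n, K, L):
    """S = n / (K * (n / c) + L * log2(c)); base-2 log is an assumption."""
    return n / (K * (n / c) + L * math.log2(c))


def best_core_count(n, K, L, max_cores):
    """Return (c, S) with the highest speedup over c = 1..max_cores."""
    candidates = ((c, speedup(c, n, K, L)) for c in range(1, max_cores + 1))
    return max(candidates, key=lambda pair: pair[1])


# The three (K, L) settings from the slides.
CONFIGS = [(1.0, 1.0), (1.5, 0.5), (0.5, 1.5)]

if __name__ == "__main__":
    n = 1024          # hypothetical problem size
    max_cores = 4096  # hypothetical upper bound on core count
    for K, L in CONFIGS:
        c_best, s_best = best_core_count(n, K, L, max_cores)
        print(f"K={K}, L={L}: best c={c_best}, speedup={s_best:.2f}")
```

Setting the derivative of $T_P$ with respect to $c$ to zero gives a continuous optimum of $c^* = K n \ln 2 / L$ for a base-2 logarithm. Under this model, a larger coupling ratio $K/L$ therefore pushes the best operating point toward more cores.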









### Conclusions

- Current multi-core approaches may not scale to support massive parallelism
- Proposed reconfigurable multi-core approach enables trade-offs between core coupling and core processing capacity
- More research needed in reconfigurable micro-architecture to support proposed framework

